Abstract

We describe a new methodology that enables the di- 
rect execution of multi-threaded applications inside of 
Shadow, an existing parallel discrete-event network sim- 
ulation framework. Our methodology utilizes function 
interposition and an application-layer thread library to 
emulate the ordinary thread interface to the application. 
Using this methodology, we implement a new Shadow 
plug-in that directly executes the Bitcoin reference client 
software. To demonstrate the usefulness of this tool, we 
present novel denial-of-service attacks against the Bit- 
coin software that exploit low-level implementation ar- 
tifacts in the Bitcoin reference client; our determinis- 
tic simulator was helpful in developing and demonstrat- 
ing these attacks. We describe optimizations that enable 
scalable execution of thousands of Bitcoin nodes on a 
single machine, and discuss how to model the Bitcoin 
network for experimental purposes. 

1 Introduction 

Experimentation testbeds for distributed systems and 
peer-to-peer networks such as Bitcoin, BitTorrent, and
Tor are beneficial to the scientific community as they
simplify the code debugging and testing process, reduce 
time to deployment of new features and protocol modi- 
fications, and promote the research and development of 
new protocols and architectural modifications. However, 
testbeds like PlanetLab and the Bitcoin testnet do not 
scale gracefully, are hard to manage and maintain, and 
do not offer as much control over experimental topology 
and node configurations as is possible under alternative 
experimentation techniques. As a result, developers and 
researchers are often unable to realize the full potential 
of the experimental method, and new code and design 
modifications are often accepted into mainline software 
without fully understanding their effects on the existing, 
often critical infrastructure. 

Alternative approaches to experimentation offer a 
unique set of benefits over the use of a distributed 
testbed. In particular, emulation may provide better
scalability and improve management of and control over the
network model and node configuration, and simulation
may further allow for more efficient execution and
repeatable experiments. Shadow [2, 24] provides an
interesting and unique alternative to traditional
experimentation techniques. At its core, Shadow is a simulator; the
operating system, network stack, internetwork topology, 
and communication between nodes are all simulated us- 
ing a discrete-event engine. However, each virtual host 
in Shadow runs real application software, such as the net- 
work’s official reference client or alternate implementa- 
tions. This unique simulation/emulation hybrid allows 
Shadow to provide the most efficient experimentation 
platform possible while remaining true to application- 
layer effects of the software executed by the virtual hosts. 
This unique approach is ideal for experimenting with 
large distributed systems and peer-to-peer networks. 

Unfortunately, Shadow does not yet natively support 
virtual hosts that fork processes or run multi-threaded 
software due to the non-trivial layer of complexity 
added to Shadow’s own internal multi-threaded simula- 
tion core. As a result, many distributed systems, includ- 
ing Bitcoin, are not amenable to simulation in Shadow. 

In this work, we extend the state of the art in this 
unique simulation/emulation space by designing and im- 
plementing a simulation architecture that allows the di- 
rect execution of multi-threaded software. As a proof 
of concept of the efficacy of our approach, we de- 
sign, implement, and test a new Shadow plug-in that di- 
rectly executes the multi-threaded Bitcoin software in- 
side the Shadow simulation framework. Our novel ap- 
proach utilizes GNU Portable Threads (a.k.a., Pth) [1], 
an application-layer library that provides non-preemptive 
priority-based scheduling for multiple threads of exe- 
cution. Pth runs in a single operating system thread 
while providing the facilities to emulate the Pthreads 
(POSIX threads) interface to the application. We then
use function interposition to redirect Pthreads function 
calls made from the virtual host to Pth, while allow- 
ing Pthreads calls initiated by Shadow itself to be for- 
warded to and handled by the Pthreads library. Using 
our techniques, multi-threaded application software run- 
ning in virtual hosts will function as intended, while the 
virtual host threads will not interfere with Shadow’s in- 
ternal threading engine. We envision that our approach 
will be ported to the Shadow core so that all existing and
future Shadow plug-ins may benefit.

Using our new Bitcoin Shadow plug-in,¹ we demonstrate
novel attacks on vulnerabilities in the Bitcoin software and
measure their cost and effectiveness. We also show
how to model the Bitcoin network and how to optimize 
the bootstrapping of a Bitcoin network. 

Our major contributions are as follows: 

• A new approach that utilizes Pth and function interpo- 
sition to allow direct execution of multi-threaded appli- 
cations in the Shadow simulator. 

• The design and implementation of a new Shadow plug- 
in that uses our techniques to directly execute the multi- 
threaded Bitcoin software. 

• The description of new attacks against Bitcoin, and an 
evaluation and measurement of these attacks using our 
new techniques and tools in a safe, private Shadow envi- 
ronment. 

• A Bitcoin model that can be used to efficiently boot- 
strap and instantiate a large Bitcoin test network in 
Shadow. 

2 Background and Related Work 

This section provides background on the Shadow simula- 
tion framework, while outlining related experimentation 
work for and previous attacks on Bitcoin. 

2.1 Shadow 

Shadow [2, 24] is a parallel discrete-event network sim- 
ulator. Shadow has a modular architecture that is broken 
into two major components: (1) the core simulator, and 
(2) software run by virtual hosts, which are dynamically 
loaded at run time as plug-in libraries. 

2.1.1 The Core Simulator 

Shadow itself is, at its core, a simulator. In addition to the 
parallel event engine, Shadow contains the logic required
to simulate both the internetwork topology over which its 
virtual hosts will communicate as well as the operating 
system base upon which virtual hosts will run. 

Event Engine. Because Shadow is a simulator, it re- 
places the concept of real time with its own simulation 
time over which it has precise control. Every action 
that happens in Shadow, such as starting virtual host ap- 
plications or sending and receiving packets, is initiated 
from an event that occurs at a precise time instant (with 
nanosecond granularity). Shadow’s event engine runs 
these events in the correct chronological order, while ad- 
hering to real-world characteristics such as network de- 
lay and loss. Shadow’s event engine can benefit from the 
use of multiple worker threads. 
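
The event-ordering idea can be illustrated with a minimal single-threaded sketch. This is not Shadow's implementation: the class name `EventEngine`, its methods, and the 50 ms link delay are assumptions made for illustration, and the real engine additionally distributes events across worker threads.

```python
import heapq
import itertools


class EventEngine:
    """Toy discrete-event engine: simulated time advances only when events run."""

    def __init__(self):
        self.now_ns = 0                 # current simulation time in nanoseconds
        self._queue = []                # min-heap ordered by (time, sequence number)
        self._seq = itertools.count()   # tie-breaker: same-time events run in FIFO order

    def schedule(self, delay_ns, callback, *args):
        """Queue callback to run delay_ns after the current simulation time."""
        entry = (self.now_ns + delay_ns, next(self._seq), callback, args)
        heapq.heappush(self._queue, entry)

    def run(self):
        """Execute all events in chronological order."""
        while self._queue:
            time_ns, _, callback, args = heapq.heappop(self._queue)
            self.now_ns = time_ns       # jump directly to the event's timestamp
            callback(*args)


def demo_event_order():
    engine = EventEngine()
    log = []
    # Scheduled out of order on purpose: the receive event is queued first.
    engine.schedule(50_000_000, lambda: log.append(("recv", engine.now_ns)))
    engine.schedule(0, lambda: log.append(("send", engine.now_ns)))
    engine.run()
    return log
```

Because time jumps from one event timestamp to the next, an idle simulated hour costs nothing in wall-clock time.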

¹ The code for our simulator is freely available at
https://github.com/shadow/shadow-plugin-bitcoin

Topology. Shadow uses the standard GraphML format
to represent the connectivity and properties of links
between the virtual hosts running in a simulation. This
topology contains both vertices and edges: vertices rep- 
resent Internet points-of-presence at which virtual hosts 
can be connected; and edges represent the path between 
those points-of-presence and their properties, including 
latency, jitter, and packet loss. Shadow models the In- 
ternet using real data available from public sources, like 
CAIDA and NetIndex. Every packet sent between two
virtual hosts will be subject to the properties of the edges 
over which the packet travels, leading to communication 
characteristics that are not unlike those that would be re- 
alized between those locations on the Internet. 
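
As a rough illustration of how edge properties could shape a packet's fate, the sketch below sums latency along a path and combines per-edge loss probabilities. The vertex names, attribute values, and the rule for combining them are assumptions for illustration; the text above does not specify how Shadow combines edge properties, and jitter is omitted.

```python
import random

# Hypothetical edge attributes between points-of-presence (not real measurements).
EDGES = {
    ("us-east", "eu-west"): {"latency_ms": 80.0, "loss": 0.01},
    ("eu-west", "asia"): {"latency_ms": 120.0, "loss": 0.02},
}


def path_properties(path):
    """Return (total latency in ms, end-to-end loss probability) for a path."""
    latency_ms, delivered = 0.0, 1.0
    for a, b in zip(path, path[1:]):
        edge = EDGES.get((a, b)) or EDGES[(b, a)]   # treat edges as undirected
        latency_ms += edge["latency_ms"]
        delivered *= 1.0 - edge["loss"]             # packet must survive every edge
    return latency_ms, 1.0 - delivered


def send_packet(path, rng=random):
    """Return the delivery delay in ms, or None if the packet is dropped."""
    latency_ms, loss = path_properties(path)
    return None if rng.random() < loss else latency_ms
```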

Operating System. Each virtual host in Shadow runs
a simulated operating system (OS), including sockets, 
pipes, network protocols (TCP and UDP), timers, asyn- 
chronous event facilities, network interfaces, and various 
data buffers. These mechanisms are implemented to sup- 
port the Linux POSIX interface and provide the function- 
ality expected by virtual host software. Note that only 
the mechanisms that would affect the simulation, such as 
time or network communication, must be implemented; 
many system functions, such as file I/O, can be handled 
directly by libc as usual. Shadow uses function
interposition to intercept calls made from virtual host
software to OS functions, and redirects them to their
simulated counterparts as required. In this way, Shadow
emulates a Linux environment for the application, which
need not be aware that it is being simulated. 
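
The following is only an analogy in Python, not Shadow's mechanism: it swaps the `time` module's functions for simulated ones while "application" code runs, so the application observes simulated time without being modified. `SimClock`, `interposed`, and `application_code` are hypothetical names, and Shadow itself interposes on native library symbols rather than Python attributes.

```python
import time
from contextlib import contextmanager


class SimClock:
    """Simulated clock: sleeping advances virtual time instead of blocking."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


@contextmanager
def interposed(clock):
    """Temporarily redirect time.time and time.sleep to the simulated clock."""
    real_time, real_sleep = time.time, time.sleep
    time.time, time.sleep = clock.time, clock.sleep
    try:
        yield clock
    finally:
        time.time, time.sleep = real_time, real_sleep   # always restore the originals


def application_code():
    """Unmodified 'application': it just calls the usual time functions."""
    start = time.time()
    time.sleep(3600)
    return time.time() - start
```

Inside `with interposed(SimClock()):`, a call to `application_code()` reports an hour of elapsed time and returns immediately.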

2.1.2 Host Software Plug-ins 

As mentioned above, each virtual host contains a simu- 
lated OS that emulates a POSIX API to the application. 
The real software applications that run in Shadow are 
themselves compiled as Shadow plug-ins and loaded at 
run time. During the compilation process, LLVM [4, 31]
is used to inject a hook function that is used by Shadow 
to pass control into the application code in order to, e.g., 
call the main function in the application and notify the 
plug-in of available input/output on file descriptors. 

Whenever Shadow instantiates a new virtual node that 
runs a particular plug-in, Shadow creates a new copy
of the plug-in’s memory heap and stores it internally. 
Shadow expects the plug-in to provide an interface in the 
form of an on_event(e) function, which Shadow invokes
whenever an I/O event e is available. The plug-in
indicates which events it is interested in by using epoll 
library functions, which Shadow intercepts. 

A Shadow plug-in that runs the Tor anonymity soft- 
ware [5] has been created [24], is maintained [3] and 
is used extensively to help explore Tor research and
development problems [14, 19, 22, 23, 25-29]. The largest
known Tor test network to date contained 3600 relay 
nodes and 12000 client nodes [23]. Our work was moti- 
vated by the utility of the Tor plug-in. 

2.2 Bitcoin Experimentation

Bitcoin [34] is a peer-to-peer “cryptocurrency” network 
that functions as a decentralized digital currency. There 
has been a recent surge of Bitcoin-related research in- 
cluding measurement, new applications, security mod- 
els, incentive analysis, and attacks (see [11] for a com- 
prehensive survey). We describe some of the existing ex- 
perimentation frameworks, and outline several areas of 
research which we believe could utilize our simulator. 

Testbeds. Parallel to the actual Bitcoin network, there
exists a public dedicated “test” network² that runs a
modified version of the code. The modifications, such as
frequently resetting the “mining difficulty”, are intended to
make experimentation easier while also discouraging
its use as an actual currency. At the time of writing,
we crawled Testnet and determined that it consists
of approximately 250 nodes (at least an order of magnitude
smaller than the actual network). Alternately, a
“testnet-in-a-box” [20] can be run as a local instance of
the test network. The main advantage of our Shadow-based
simulation over Testnet is that it affords the experimenter
greater control over the network while still providing
an accurate simulation of the network structure.

Several other projects, such as Simbit [13], simulate 
various aspects of the Bitcoin network. However, these 
do not run the actual bitcoind application code, but rather 
implement simplified abstractions; these may oversim- 
plify or misrepresent the actual behavior. The attacks 
we demonstrate in Section 5, in particular, make use 
of implementation-specific behavior that is not modeled 
elsewhere. 

Another form of testbed is a platform for measuring 
and interacting with the live network itself, such as Coin- 
seer [30] and Coinscope [33]. 

Attacks. A very large focus of Bitcoin-related research
has been on Bitcoin’s weak privacy guarantees. Although
the reference client takes some measures to maintain
privacy, such as creating a new address to store
“change,” implementation quirks often allow one Bitcoin
transaction to be linked to others [32]. The timing
of information propagation can often be used to associate
transactions with IP addresses [30]. Another vector for
deanonymization involves exploiting the mechanisms by
which Bitcoin nodes propagate information about their
peers [10]. Bitcoin privacy could be improved through
a variety of techniques such as mixing [12, 35] or by
upgrading the Bitcoin protocol to support privacy-preserving
cryptography [9, 36].

A well-known class of attacks involves deviating from
the default mining behavior, and can in some cases allow
the deviating miner to profit disproportionately [7, 17, 18].


² See https://en.bitcoin.it/wiki/Testnet


Another well-known class of attacks involves “double- 
spending” by convincing a victim that a payment transac- 
tion is (or will be imminently) accepted by the network, 
when in reality the attacker has ensured that a conflict- 
ing transaction will be accepted first [8]. So-called “fast 
payment” attacks exploit weaknesses of Bitcoin’s infor- 
mation propagation mechanism [15]. We illustrate how 
our simulator can be used to model information propaga- 
tion in Bitcoin. 

Researchers have recently demonstrated an attack
that fills up a node’s address list so that it eventually only
connects to the attacker’s nodes [21]. This attack and
the vulnerabilities it exploits are unrelated to ours. They 
demonstrate and evaluate their attack against a “victim” 
node that they connect to the live network, and propose 
but do not evaluate several potential countermeasures; 
we believe an implementation of this attack in our sim- 
ulator would be a good way to evaluate potential coun- 
termeasures and study their interactions within the entire 
network. 

Another recently published denial-of-service attack 
involves exploiting the address propagation mechanism 
to exhaust a node’s memory, causing it to crash [10]. The 
attack we demonstrate uses an entirely different mecha- 
nism, but has a similar effect. 

3 Direct Execution of Multi-Threaded 
Applications 

Shadow plug-ins ordinarily use an epoll-based event loop.
This works well for systems such as Tor and BitTorrent,
which are implemented as a single event loop and ex- 
clusively use sockets in non-blocking mode. Essentially, 
a Shadow plug-in consists of an event loop that is exe- 
cuted by the Shadow framework; each Shadow worker 
thread delivers one event to a single instance at a time. 
Shadow uses cooperative scheduling, rather than pre- 
emptive scheduling. As such, it’s assumed that each 
plug-in finishes responding to every event within a short 
time. 

This assumption does not hold for typical multi- 
threaded applications, such as Bitcoin, that create several 
OS-level threads - typically through the POSIX threads 
(Pthreads) API - and allow each thread to block when 
reading or writing to a socket. 

In this section we describe an architecture for directly 
running such applications as a Shadow plug-in. 

Pth. Pth (GNU Portable Threads) [1] is a free software
library that provides user-space threads. Pth threads are 
cooperative rather than pre-emptive. A Pth thread runs 
until it reaches a pth_yield instruction, which trans- 
fers control to the scheduler and activates another avail- 
able thread. The underlying mechanism for switching 
between threads is fairly intricate [16]; it involves
introspection and self-modification of the program stack.

    on_startup():
        for every event e we're waiting for:
            epoll_add(e)

    on_event(e):
        handle event e

Figure 1: Pseudocode for an ordinary epoll Shadow plug-in

    on_startup():
        pth_create(plugin_main);
        pth_setpriority(LOWEST);
        pth_yield();
        for each event e a thread is waiting on:
            epoll_add(e)

    on_event(e):
        pth_yield();
        epoll_clear()
        for each event e a thread is waiting on:
            epoll_add(e)

Figure 2: Pseudocode for a Pth-based Shadow plug-in

Pth provides a substitute for the Pthreads API, as well as for
the ordinary suite of POSIX I/O operations, such as read- 
ing and writing on files and sockets. The Pth version 
of an I/O operation ensures that the underlying file de- 
scriptor is in non-blocking mode; instead of blocking, it 
uses pth_yield to yield to the scheduler and indicates 
which events it can wait for. 

Ordinarily, the Pth scheduler will activate threads un- 
til every thread is blocked waiting for an I/O event, 
and then it will use select in blocking mode to ac- 
tually wait for an event. A plug-in using Pth directly 
would violate Shadow’s assumption that the plug-in pro- 
cesses each event quickly and then returns control back 
to Shadow. 

We take a simple approach that bypasses Pth’s block- 
ing call to select. Instead, we ensure that the “Shadow 
thread” is always available to run, but assign it the 
lowest-priority value so that it is only activated when ev- 
ery other thread is blocked. (The “main thread” of the 
application code is run in a Pth thread with ordinary priority).
When the Shadow thread receives an event, it yields
to the Pth scheduler which activates any threads that can 
now run. When no more threads can run, the Shadow 
thread inspects which events the other threads are waiting 
for, and translates these into epoll event requests, which 
Shadow recognizes. Pseudocode for this plug-in archi- 
tecture is given in Figure 2 (compare with pseudocode 
for a typical Shadow plugin in Figure 1). 
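
The scheduling idea can be modeled in a few lines, with Python generators standing in for Pth threads. This is a simplified sketch and not the plug-in's code: `PthPlugin`, `make_reader`, and the convention that a thread "blocks" by yielding a descriptor number are assumptions made for illustration.

```python
class PthPlugin:
    """Toy model of the Pth-based plug-in: generators play the role of Pth threads."""

    def __init__(self, thread_funcs):
        self.waiting = {}             # blocked thread -> descriptor it waits on
        self.epoll_interest = set()   # what the "Shadow thread" registers with epoll
        self.runnable = [func() for func in thread_funcs]
        self._run_app_threads()

    def _run_app_threads(self):
        # Application threads have higher priority: run each until it blocks.
        while self.runnable:
            thread = self.runnable.pop(0)
            try:
                self.waiting[thread] = next(thread)   # runs until it yields a descriptor
            except StopIteration:
                pass                                  # thread exited
        # Every thread is blocked, so the lowest-priority "Shadow thread" runs:
        # it clears and rebuilds the epoll interest set from the pending waits.
        self.epoll_interest = set(self.waiting.values())

    def on_event(self, fd):
        """Called by the simulator when descriptor fd becomes ready."""
        woken = [t for t, waited in self.waiting.items() if waited == fd]
        for thread in woken:
            del self.waiting[thread]
        self.runnable.extend(woken)
        self._run_app_threads()


def make_reader(fd, log, reads):
    """Build an application thread that performs `reads` blocking reads on fd."""
    def thread():
        for _ in range(reads):
            yield fd          # "block" until fd is readable
            log.append(fd)    # handle the data that arrived
    return thread
```

Control returns to the caller of `on_event` as soon as no application thread can make progress, which preserves the assumption that each event is handled quickly.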

Supporting select. Pth uses select-based tools 
for manipulating file descriptors. These crucially assume 
that the file descriptor is less than 1024. However, epoll-based
programs do not make this assumption, and Shadow is
currently unfriendly to select-based programs, since its
virtual mapped file descriptors may be arbitrarily large, and in
fact every “instance” in Shadow is given unique descriptor numbers.

To fix this, we added an extra layer of mapping between
file descriptors. For each instance, Shadow maintains
a mapping between the local file descriptor number
(which is typically less than 1024) and the actual file
descriptor on the host (which is unique among all
instances, and therefore typically greater than 1024).
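
A per-instance translation table of this kind might look like the following sketch. The class `DescriptorTable`, the choice to start local numbers at 3, and the lowest-free-number allocation policy are assumptions for illustration, not details taken from Shadow.

```python
import itertools

FD_SETSIZE = 1024   # select-based code assumes descriptors below this bound


class DescriptorTable:
    """Per-instance map between small local descriptors and global host descriptors."""

    def __init__(self):
        self._local_to_host = {}
        self._host_to_local = {}

    def register(self, host_fd):
        """Assign the lowest free local number (from 3 upward) to host_fd."""
        local_fd = next(n for n in itertools.count(3) if n not in self._local_to_host)
        if local_fd >= FD_SETSIZE:
            raise OSError("too many open descriptors for select")
        self._local_to_host[local_fd] = host_fd
        self._host_to_local[host_fd] = local_fd
        return local_fd

    def to_host(self, local_fd):
        return self._local_to_host[local_fd]

    def to_local(self, host_fd):
        return self._host_to_local[host_fd]

    def release(self, local_fd):
        """Forget a closed descriptor so its local number can be reused."""
        host_fd = self._local_to_host.pop(local_fd)
        del self._host_to_local[host_fd]
```

Two instances can each hand out local descriptor 3 while referring to different host descriptors, which is what lets select-style code run unchanged.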

Interposition of Pthreads. While the approach de- 
scribed above is suitable for writing new Pth-based ap- 
plications, most existing application code is written to 
depend on the Pthreads API. Our solution is to intercept
calls intended for the I/O or Pthreads libraries, and route
them to the appropriate Pth functions. Fortunately, Pth 
provides an emulation of the Pthreads interface, which 
we were able to use mostly intact. 

Ease of Adding New Applications. Although we fo- 
cus on Bitcoin-related plug-ins in this paper, we believe 
our framework can easily be used to simulate most other 
single- or multi-threaded applications as well, with little 
or no application-specific customization required in gen- 
eral. A limitation is that any program relying on busy- 
loops or expensive computations to avoid deadlock or 
races must be modified; our framework assumes each ac- 
tivated application thread eventually reaches a blocking 
I/O call, waits for a lock, or sleeps. Computations in 
Shadow all occur in one virtual time instant, so an appli- 
cation thread performing long computations will appear 
to run fast. Our implementation does not support multiprocessing
(i.e., fork() or exec()).

4 Simulating Bitcoin in Shadow 

We implemented the architecture above as a reusable 
plug-in “template” for simulating arbitrary multi- 
threaded applications; the template simply calls the ap- 
plication’s “main” function to create a new instance. As 
a proof-of-concept, we used this framework to build a 
Shadow plug-in for bitcoind. We now describe the archi- 
tecture of bitcoind and several further changes we needed 
to make to Shadow to support it. 

The Satoshi Client. While the Bitcoin network comprises
dozens of different client implementations, the de
facto standard is the “reference client” (also known as
bitcoind, mainline, or the Satoshi client). There is
arguably no authority to define an “official” client;
regardless, as bitcoind remains far and away the most
widely used client,³ alternative clients generally
aim for full compliance with its behavior.

³ The reference client is the most popular among “reachable”
nodes that receive incoming connections. According to
https://getaddr.bitnodes.io/, which performs daily crawls of the
network, recent versions of Satoshi account for 83% of the reachable
nodes. BitcoinJ is likely more popular among mobile clients, which
are often behind a firewall and do not receive incoming connections.

The reference client was originally written by the 
pseudonymous author, Satoshi Nakamoto, and published 
to a cryptography mailing list. Since then, it has been 
maintained as a free software project. The reference 
client is written in C++, and uses a heterogeneous 
multi-threaded architecture. The basic architecture of 
bitcoind has remained unchanged, despite frequent 
version updates with optimizations and new features. It 
contains dependencies on several libraries such as LevelDB
and Boost.⁴

The reference client interacts with the rest of the net- 
work by sending and receiving messages. It uses one 
connection handler thread for each connected peer, but 
a single main thread that processes messages arriving in 
a queue. It maintains eight outgoing connections by default,
and handles up to 117 incoming connections. This
multithreaded architecture makes use of both blocking 
and non-blocking behavior. For example, although each 
peer connection uses a non-blocking socket, the connec- 
tion thread performs a blocking sleep for 100 millisec- 
onds in between polling the socket. 

Supporting C++ in Shadow. Shadow supports ordinary 
static variables by initializing them once, then memoiz- 
ing the initialized state to reuse later for other instances. 
This is insufficient for C++, since static objects may ex- 
ecute arbitrary code in their constructors. We modified 
Shadow with an extra LLVM pass that executes all nec- 
essary constructors each time an instance is loaded. 

Injector. In addition to the bitcoind plug-in, we used
our multi-threaded framework to easily build a special 
purpose “injector” plug-in for our experiments. The 
plug-in connects to a single node and performs only 
the minimal handshake required before sending a pay- 
load of messages from a file. This plug-in shares no 
common codebase with bitcoind whatsoever, and in- 
stead uses a free library made by Bitcoin core developer 
Jeff Garzik called PicoCoin that provides C routines for 
manipulating Bitcoin protocol messages. Using this li- 
brary, we made a small application, the injector, that 
communicates with the Bitcoin network in a very lim- 
ited way. Effectively, it connects to a node, performs 
the VERSION/VERACK handshake, delivers a payload of 
blocks and transactions, and then quits. It requires under 
250 lines of code. 
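
To give a feel for what such a minimal client must produce on the wire, the sketch below frames a Bitcoin protocol message: a 24-byte header (network magic, zero-padded command, payload length, and the first four bytes of a double SHA-256 checksum) followed by the payload. It is written in Python for brevity and is not the injector's code, which is built on PicoCoin in C; the helper names `frame` and `parse` are hypothetical.

```python
import hashlib
import struct

MAINNET_MAGIC = bytes.fromhex("f9beb4d9")   # magic bytes of the main Bitcoin network


def checksum(payload):
    """First four bytes of SHA-256 applied twice to the payload."""
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]


def frame(command, payload=b""):
    """Wrap a payload in the 24-byte Bitcoin message header."""
    header = (
        MAINNET_MAGIC
        + command.encode("ascii").ljust(12, b"\x00")   # command name, NUL-padded to 12 bytes
        + struct.pack("<I", len(payload))              # payload length, little-endian
        + checksum(payload)
    )
    return header + payload


def parse(message):
    """Split a framed message into (command, payload), verifying magic and checksum."""
    magic, command, length, check = struct.unpack("<4s12sI4s", message[:24])
    payload = message[24:24 + length]
    if magic != MAINNET_MAGIC or len(payload) != length or checksum(payload) != check:
        raise ValueError("malformed message")
    return command.rstrip(b"\x00").decode("ascii"), payload
```

A VERACK message has an empty payload, so `frame("verack")` is exactly 24 bytes long; a VERSION message uses the same framing around a non-empty payload.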

Local Sockets. We’ve taken initial steps towards sup- 
porting simulations of Bitcoin network measurement 
platforms, such as Coinscope [33], within our frame- 
work. Coinscope consists of multiple processes (each 
of which becomes a single plugin) that coordinate using
local Unix domain sockets (which are not currently supported
within Shadow). We implemented Unix domain
sockets as a new socket type in Shadow and are now able
to directly execute Coinscope code.

⁴ We omit BerkeleyDB by compiling bitcoind without the “wallet”
functionality that depends on it. A BerkeleyDB version change was
involved in an accidental “fork” disaster [6] where non-updated nodes
temporarily diverged from the network.

5 The mapOrphans Attack 

In this section, we present novel denial-of-service attacks 
that exploit vulnerabilities in the bitcoind implemen- 
tation. We describe how we used our simulator to imple- 
ment and evaluate these attacks, demonstrating that our 
simulator framework is useful for practical research. 

The mapOrphans Vulnerability. Bitcoin transactions 
form a directed graph; each transaction spends some pre- 
viously available “input” coins, and creates several new 
“output” coins that can be spent by subsequent transac- 
tions. Consider a pair of related transactions: one trans- 
action (the “child”) spends a transaction output created 
by the other (the “parent”). If a node receives these 
transactions out of order (i.e., first the child and then the 
parent), the child transaction cannot be validated until
the parent is received. To help with out-of-order arrivals
(e.g., due to varying latency or a dropped connection),
the reference client maintains a buffer called mapOrphans.⁵
Transactions with unknown parents are placed
in this buffer, and are not validated until the parent is re- 
ceived. 

This mechanism can be exploited to circumvent 
bitcoind’s defenses. The most computationally expensive
step in validating a transaction is checking the
ECDSA signature. To prevent exhaustion attacks, signa- 
ture checking is deferred until all other validation steps 
are complete, and a node bans any peer that sends trans- 
actions with invalid signatures. However, when a node 
places transactions in mapOrphans, it forgets which
peer relayed them. Thus by sending a set of (invalid) transactions
out of order (as illustrated in Figure 3), an attacker
can (at low cost to itself) cause a node to perform
a large number of signature checks. Since bitcoind 
processes all transactions in a single main thread, the net 
result is that a victim node can be frozen until all the sig- 
natures have been processed. 
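
The flaw can be modeled abstractly as follows. This toy is not bitcoind's code: the class and function names, the dictionary-based transaction format, and the assumptions that checking stops at the first invalid signature and that the attacker places the invalid signature last are all illustrative.

```python
class VulnerableOrphanPool:
    """Toy orphan buffer that drops the sender's identity, as described above."""

    def __init__(self, is_valid_signature):
        self.is_valid_signature = is_valid_signature
        self.by_parent = {}          # parent txid -> orphan transactions waiting on it
        self.signature_checks = 0    # stands in for CPU time spent on the main thread
        self.banned_peers = set()

    def add_orphan(self, peer, tx):
        # The peer argument is ignored: nothing links the orphan back to its sender.
        self.by_parent.setdefault(tx["parent"], []).append(tx)

    def receive_parent(self, parent_txid):
        """Validate every orphan that was waiting for this parent."""
        for tx in self.by_parent.pop(parent_txid, []):
            for signature in tx["signatures"]:
                self.signature_checks += 1
                if not self.is_valid_signature(signature):
                    break            # invalid transaction, but no peer is known to ban


def run_attack(orphans=100, signatures_per_orphan=40):
    pool = VulnerableOrphanPool(lambda signature: signature != "bad")
    for i in range(orphans):
        # Hypothetical payload: only the final signature fails, so all are checked.
        signatures = [f"ok-{i}-{j}" for j in range(signatures_per_orphan - 1)] + ["bad"]
        pool.add_orphan("attacker", {"parent": "P", "signatures": signatures})
    pool.receive_parent("P")         # the parent arrives last and triggers validation
    return pool.signature_checks, pool.banned_peers
```

With the default arguments the victim performs 4000 signature checks and bans nobody, even though every orphan turned out to be invalid.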

The primary constraints on this attack are the maxi- 
mum size of mapOrphans (10k transactions) and the 
maximum size of an orphan transaction (5KB, enough to 
hold 40 signatures). On a test machine (Intel Core i7, 
1.73 GHz), each signature verification took 1.7 milliseconds.
A full buffer therefore costs about 10,000 × 40 × 1.7 ms ≈ 680
seconds of verification, so an attacker could plausibly freeze
a node for over 10 minutes.
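
The bound follows directly from the three figures quoted above, as this small calculation shows. The constant and function names are ours; the values are the ones stated in the text.

```python
MAX_ORPHANS = 10_000        # maximum number of transactions in mapOrphans
SIGS_PER_ORPHAN = 40        # signatures that fit in a 5 KB orphan transaction
VERIFY_MS = 1.7             # measured cost of one signature verification


def freeze_seconds(orphans=MAX_ORPHANS, sigs=SIGS_PER_ORPHAN, verify_ms=VERIFY_MS):
    """Worst-case time the main thread spends verifying a full orphan buffer."""
    return orphans * sigs * verify_ms / 1000.0


def freeze_minutes(**kwargs):
    return freeze_seconds(**kwargs) / 60.0
```

With the defaults this evaluates to roughly 680 seconds, a little over 11 minutes, on hardware comparable to the test machine.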

Evaluation in Shadow-Bitcoin. We implemented this 
attack in our simulator to confirm its effectiveness. We 
generated a payload of transactions as described above, 
and used the injector plug-in to deliver it to an in- 
stance of bitcoind (version 0.9.2). 

⁵ We use mapOrphans to abbreviate mapOrphanTransactions.

Figure 3: Transactions used in the mapOrphans DoS Attack.
The payload transactions are invalid and mutually conflicting. 
The attacker delivers these transactions to the victim out-of- 
order: first the payloads, and then the parent. 

It was necessary to modify the bitcoind code to 
simulate the time delay of signature validation, since 
Shadow models all computation as occurring instanta- 
neously. Therefore we inserted a sleep function after 
signature verification based on the amount of time our 
measurements indicate the computation should take. 

We used our simulator to observe the effects of a 
frozen message queue on a node’s connections. We 
experimented with this by configuring the Shadow ex- 
periment script to launch nodes and form new connec- 
tions at various moments before, during, and after an 
attack. Connections that were established prior to the 
attack are still serviceable after the attack subsides. Al- 
though the stalled message queue prevents the node from 
responding to messages from its peers, the separate connection
threads prevent the socket buffers from overflowing.
However, new incoming connection attempts are
dropped, since a peer times out if the initial handshake 
is not completed within 60 seconds. 

Memory Consumption Extension. While the victim’s 
main thread is busy processing invalid transactions, the 
connection handler threads continue to receive and buffer 
input from each of its peers. Each connection buffers up 
to 5 megabytes of messages; if this limit is reached, the 
connection is dropped. By using up the maximum avail- 
able connections (i.e., 125), and filling up these buffers to 
the maximum limit while a mapOrphans attack is underway,
an attacker can consume more than 500 megabytes (up to 125 × 5 MB = 625 MB)
of RAM. This can crash nodes with a limited (but plau- 
sible) amount of memory. 

Mitigations. We reported these vulnerabilities to the Bit- 
coin developers, who deployed a mitigation in version 
0.9.3. The mitigation reduces the size of mapOrphans 
to 100, down by a factor of 100. Also, whenever a transaction
is placed in mapOrphans, the identity of the peer
who sent it is stored alongside the transaction itself. If a 
mapOrphans transaction turns out to be invalid, then 
that peer is disconnected. Whenever a peer disconnects, 
any items in mapOrphans associated with that peer are 
discarded without inspection. 
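
The bookkeeping behind this mitigation can be sketched as below. This is a toy model rather than the patched bitcoind code: the class name, the method names, and the policy of evicting the oldest entry when the buffer is full are assumptions for illustration.

```python
class MitigatedOrphanPool:
    """Toy orphan buffer that remembers which peer relayed each transaction."""

    def __init__(self, max_orphans):
        self.max_orphans = max_orphans
        self.orphans = {}            # txid -> (peer, transaction), in insertion order
        self.disconnected = set()

    def add_orphan(self, peer, txid, tx):
        if len(self.orphans) >= self.max_orphans:
            # Assumed eviction policy: drop the oldest entry to stay within the limit.
            del self.orphans[next(iter(self.orphans))]
        self.orphans[txid] = (peer, tx)

    def report_invalid(self, txid):
        """An orphan failed validation: disconnect the peer that relayed it."""
        peer, _ = self.orphans[txid]
        self.disconnect(peer)

    def disconnect(self, peer):
        """Drop the peer and discard all of its orphans without inspecting them."""
        self.disconnected.add(peer)
        self.orphans = {
            txid: entry for txid, entry in self.orphans.items() if entry[0] != peer
        }
```

After the first invalid orphan is detected, the attacker's remaining orphans are discarded unvalidated, so the cost imposed on the victim no longer scales with the size of the payload.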

Shadow’s role. Our simulator’s faithful modeling of 
application-level behavior helped us notice errors in our 
initial implementation. For example, bitcoind processes
signatures in a deterministic order, which we
exploit to incur the greatest cost on the victim; also,
bitcoind maintains a cache of previously-validated 
signatures, hence the attack must consist of entirely dis- 
tinct valid signatures. Our simulator allowed us to make 
rapid iterations while developing and testing the im- 
plementation. Additionally, the deterministic network- 
schedule simulation simplified debugging our experi- 
ment involving the victim’s peer connections. 

6 Bitcoin Network Model 

We now describe how to run thousands of simulated 
instances of bitcoind to create a realistic, full-scale 
model of the Bitcoin network. 

Bitcoin network topology. Although Shadow already 
supports existing datasets for the Internet topology, we 
must model the Bitcoin network overlay topology. 

A list of the reachable IPs on the Bitcoin network
can be imported from publicly available snapshots from
getaddr.bitnodes.io. We used data obtained
through our own crawls using Coinscope [33]. Our network
model consists of 6081 nodes; roughly 40% of
these are from the US, and a nearly equal number are from
Europe. Our data only includes IPv4 addresses, although
IPv6 nodes are supported by Bitcoin; according
to getaddr.bitnodes.io data, at the time of writing
less than 4% of nodes use IPv6.

The actual Bitcoin network forms its overlay topology
through an intricate mechanism [33]. Information
about potential peers propagates throughout the network 
through a gossip protocol. Each node maintains a list of 
peers it knows about, and tries to maintain exactly eight 
outgoing connections; when an outgoing connection is 
dropped, the node selects a random peer from the set 
it knows about and attempts to make a new connection. 
When a new node first joins the network, it queries sev- 
eral hardcoded “seed” nodes for a small starting list. Af- 
ter forming initial connections from this list, nodes learn 
about each other by relaying “ADDR” messages contain- 
ing the IP and port of themselves and their peers. 

For simplicity, we bypassed this procedure by using 
existing bitcoind configuration options to force node 
connectivity. The data from our Coinscope crawls give 
us a set of known peer IP addresses to which each node in 
our model has connected. To downscale our network, we 
start with this 6081-node connectivity model and then
repeatedly remove the least connected node (the node with
the fewest edges) as well as the edges to and
from that node until reaching a network with the desired 
number of nodes. Finally, we configure each node with 
8 connections from the remaining set of edges. 
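
The downscaling step can be sketched in a few lines of Python. This is a minimal illustration and not our actual tooling: the function names downscale and pick_connections are hypothetical, the crawl data is assumed to be a list of (node, peer) pairs, edges are treated as undirected, ties are broken by node name, and the rule for choosing 8 connections (the first 8 surviving peers in sorted order) is an assumption, since the text above does not specify one.

```python
def downscale(edges, target_size):
    """Repeatedly drop the least connected node until target_size remain."""
    # Build an undirected adjacency map from (node, peer) pairs (assumption).
    adjacency = {}
    for a, b in edges:
        if a == b:
            continue
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    while len(adjacency) > target_size:
        # Pick the node with the fewest edges; break ties by name so the
        # result is deterministic.
        victim = min(adjacency, key=lambda n: (len(adjacency[n]), n))
        # Remove the node together with every edge that touches it.
        for neighbor in adjacency.pop(victim):
            adjacency[neighbor].discard(victim)
    return adjacency


def pick_connections(adjacency, limit=8):
    """Give each node at most `limit` peers from its surviving edges."""
    return {node: sorted(peers)[:limit] for node, peers in adjacency.items()}


# Toy example: a triangle (a, b, c) with a two-node tail (d, e).
crawl = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]
print(pick_connections(downscale(crawl, 3)))
```

In the toy example the tail nodes e and d are removed first, leaving the triangle intact. Each per-node peer list could then be written into that node's configuration.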

The least connected node was chosen in order to mini- 
mize the number of edges that get removed from the orig- 
inal connectivity graph. We acknowledge that this over- 
simplified connectivity model likely affects the accuracy 
of our simulated network. For example, it has been
shown that the information propagation effectiveness can 
be influenced by even a single well-connected node [15]. 
Preliminary measurements of the Bitcoin network have 
provided evidence of many such well-connected nodes 
and that random graph models do not account for the ob- 
served network structure [33]. However, we stress that 
our primary goal is to demonstrate the flexibility we have 
in creating a topology of our choosing, and we believe 
that it is more important to understand how changes in 
a given network affect behavior than it is to precisely 
model the real network. 

Providing initial blockchain state. Each node in the 
Bitcoin network typically maintains its own copy of the 
entire blockchain. In our model network, we begin with 
all the nodes “in sync” to some prior blockchain state. 

To reduce the storage cost, we allow the simulated
nodes to share a single copy of state files whenever
possible. The bitcoind data directory primarily consists
of a set of block files, each of which stores up to 128
megabytes worth of blocks, and a LevelDB database that
maintains an index into the block files and is used to
look up individual blocks or transactions on disk.

The block files are append-only rotated logs; once a 
block file reaches 128MB, it is finalized and never writ- 
ten to again. Therefore, each node only ever needs to 
write to the “newest” block file. By choosing our initial
blockchain state to correspond to the first block after a
block file is completed, we minimize the amount of data that
must be copied rather than aliased. Similarly, the
LevelDB database consists of a number of append-only files
that, once full, can also be aliased.
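
As an illustration of the aliasing idea, the following sketch prepares one node's block directory from a shared copy. It is a simplified example under stated assumptions rather than our implementation: the function name alias_blocks is hypothetical, block files are assumed to be named blkNNNNN.dat with zero-padded indices (so that lexicographic order matches creation order), aliasing is done with symbolic links, and the LevelDB files are not handled.

```python
import os
import shutil


def alias_blocks(shared_dir, node_dir):
    """Populate node_dir from shared_dir, copying only the newest block file."""
    os.makedirs(node_dir, exist_ok=True)
    # Assumed naming scheme: blk00000.dat, blk00001.dat, ...
    block_files = sorted(
        name for name in os.listdir(shared_dir)
        if name.startswith("blk") and name.endswith(".dat")
    )
    for name in block_files:
        source = os.path.abspath(os.path.join(shared_dir, name))
        target = os.path.join(node_dir, name)
        if name == block_files[-1]:
            # The newest file may still be appended to, so each node
            # needs a private copy.
            shutil.copy2(source, target)
        else:
            # Finalized files are never written again and can be shared.
            os.symlink(source, target)
```

With this layout, the per-node storage cost is bounded by the size of a single block file plus whatever the node appends during the experiment.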

Overall, we are able to run a 6000-node simulation
using less than 350 gigabytes of RAM and less than 300 
gigabytes of storage. 

Transaction Propagation Experiment. We now ex- 
plore how information propagates in Bitcoin using simu- 
lated networks at various scales. 

Transactions propagate through the network using a 
three-round protocol. When a node receives a valid trans- 
action from one of its peers, it sends an INV message 
containing the transaction’s hash to each of its peers. 
When a peer receives an INV containing a transaction it 
does not know about, it requests the transaction by send- 
ing GETDATA. Finally, a node responds to GETDATA 
with the actual TX data. 
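
The three rounds can be traced with a toy model. The sketch below is illustrative only: the class Node, its method names, and the helper connect are hypothetical, message delivery is modeled as a direct function call with no delay or buffering, and transaction validation is omitted.

```python
trace = []  # (message, sender, receiver) tuples, in delivery order


class Node:
    def __init__(self, name):
        self.name = name
        self.peers = []
        self.known = {}  # transaction hash -> transaction data

    def accept_tx(self, tx_hash, tx_data):
        """Store a new transaction and announce its hash to every peer."""
        if tx_hash in self.known:
            return
        self.known[tx_hash] = tx_data
        for peer in self.peers:
            trace.append(("INV", self.name, peer.name))
            peer.on_inv(tx_hash, self)

    def on_inv(self, tx_hash, sender):
        """Request the transaction only if its hash is unknown."""
        if tx_hash not in self.known:
            trace.append(("GETDATA", self.name, sender.name))
            sender.on_getdata(tx_hash, self)

    def on_getdata(self, tx_hash, requester):
        """Answer a request with the actual transaction data."""
        trace.append(("TX", self.name, requester.name))
        requester.accept_tx(tx_hash, self.known[tx_hash])


def connect(a, b):
    """Create a bidirectional peer link."""
    a.peers.append(b)
    b.peers.append(a)


# Three nodes in a line: a - b - c. The transaction enters at node a.
a, b, c = Node("a"), Node("b"), Node("c")
connect(a, b)
connect(b, c)
a.accept_tx("h1", "payload")
print(trace[:3])
```

The first three trace entries are the INV from a to b, the GETDATA from b to a, and the TX from a to b. Node b then announces the hash to both of its peers, but only c, which has not yet seen the transaction, requests it.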

INV messages are not sent immediately; instead, INV
messages are buffered for each peer, and every tenth of
a second, one peer is selected at random and the
corresponding buffer is flushed. If a node has N connections,
then a given peer is selected with probability 1/N at each
flush, so it takes on average N/10 seconds
before the INV message is received.
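
The expected buffering delay can be checked with a short Monte Carlo estimate. This is a sketch under simplifying assumptions, not a model of the client code: the function name mean_inv_delay and its parameters are hypothetical, the peer to flush is assumed to be chosen uniformly and independently at every tick, and the first opportunity to flush is assumed to come one full tick after the message is buffered. Under these assumptions the number of ticks until a particular peer is chosen is geometric with mean N, which corresponds to N/10 seconds.

```python
import random


def mean_inv_delay(num_peers, trials=20000, tick=0.1, seed=1):
    """Estimate the mean wait until one particular peer's buffer is flushed."""
    rng = random.Random(seed)
    total_ticks = 0
    for _ in range(trials):
        ticks = 1
        # Each tick flushes one peer chosen uniformly at random;
        # peer 0 is the one we are watching.
        while rng.randrange(num_peers) != 0:
            ticks += 1
        total_ticks += ticks
    return tick * total_ticks / trials


# With 8 connections the estimate should be close to 8/10 = 0.8 seconds.
print(mean_inv_delay(8))
```

This per-hop delay grows with the number of connections, which is one reason application-level behavior can dominate propagation time.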

We instantiated a model network at block height
120594, which corresponds to April 2011. In order to
simulate spending coins mined then, we modified the
client to recognize a hard-coded public key and replace
it with a default public key for which we know the
corresponding private key.

Figure 4: Transaction propagation in our simulated Bitcoin
network (black: minimum, red: mean, blue: maximum). The
experiments were averaged over 10, 3, and 1 runs, respectively.

Figure 5: Transaction propagation in our 6000-node simulated
Bitcoin network (zoomed to show Europe and the eastern coast
of North America). The large triangle (Houston) indicates the
transaction origin. The color of each point indicates the time to
receive a transaction, averaged over 100 trials (blue is faster,
green and yellow are longer).

We experimented with overlay topologies containing 
1000, 2000, and 6000 nodes to determine the effect of 
network size on transaction propagation times. For each 
experiment, we generated 100 transactions, and relayed 
them through a randomly chosen entry node. The results 
from these three experiments are shown in Figure 4. In 
our model, on average, transactions take longer to prop- 
agate in a larger network. 

In Figure 5, we overlay the data for our 6000-node
experiment on a map. There appears to be little geographic
correlation with transaction propagation time, suggesting 
that application delays and the structure of the overlay 
network have a greater impact. 

Limitations. While these experiments demonstrate the 
versatility of our simulation framework, we stress that 
our initial network model does not capture many salient 
aspects of the Bitcoin network. For example, the pres- 
ence of even a small number of very well-connected Bit- 
coin nodes can have measurable impact on information 
propagation [15]. Since the network is widely known to 
indeed have well-connected nodes [33], the accuracy of 
our network model likely suffers. Improving our network 
model and validating its accuracy is ongoing work. 

7 Conclusion

In this paper, we introduced a new methodology that 
enables virtual hosts in the Shadow parallel discrete- 
event simulator to run multi-threaded applications. Us- 
ing this methodology, we designed and developed a 
Shadow plug-in that runs the Bitcoin reference software, 
explained how we model a Bitcoin network for testing
purposes, and described optimizations that enable us to
run thousands of Bitcoin nodes in a private test
network. Finally, we demonstrated the efficacy of our
plug-in through transaction propagation experiments, and by
demonstrating novel denial-of-service attacks based on
the mapOrphans transaction processing queue.

Lessons Learned. Through this work, we have realized 
the benefit of having access to a simulation environment 
that runs real software. Not only does our Bitcoin simu- 
lator allow us to scale to the largest Bitcoin test-network 
known to date, but it also enables rapid prototyping of 
new features and fixes. In fact, when the topology size is
small, our experiments run faster than real time.

Accurate simulators, rather than simplified abstrac- 
tions, are ideal tools for studying the nuances of dis- 
tributed system software. Our work has contributed to 
our understanding that the Bitcoin peer-to-peer protocol 
is flawed and highly vulnerable, and that many potential 
vulnerabilities and exploits lie within the low-level
details of the Bitcoin implementation.

Future Work. Although we have demonstrated the
usefulness of our approach and taken initial steps towards
modeling the Bitcoin network, it still remains for us to
validate the accuracy of our network model by comparing
it with measurements.
We also hope to work with the Shadow developers to 
merge our work into Shadow core so that other multi- 
threaded applications can more easily run in Shadow. Fi- 
nally, we intend to continue studying the Bitcoin imple- 
mentation for vulnerabilities and hope to help improve 
the software through mitigation techniques that we can 
show through simulation to be effective. 